{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# An RNN for short-term predictions\n",
    "This model will try to predict the next value in a short sequence based on historical data. This can be used for example to forecast demand based on a couple of weeks of sales data.\n",
    "\n",
    "<div class=\"alert alert-block alert-info\">\n",
    "Things to do:<br/>\n",
    "<ol start=\"0\">\n",
    "<li> Run the notebook. Initially it uses a linear model (the simplest one). Look at the RMSE (Root Means Square Error) metrics at the end of the training and how they compare against a couple of simplistic models: random predictions (RMSErnd), predict same as last value (RMSEsal), predict based on trend from two last values (RMSEtfl).\n",
    "<li> Now implement the DNN (Dense Neural Network) model [here](#assignment1) using `tf.layers.dense`. See how it performs.\n",
    "<li> Swap in the CNN (Convolutional Neural Network) model [here](#assignment2). It is already implemented in function CNN_model. See how it performs.\n",
    "<li> Implement the RNN model [here](#assignment3) using a single `tf.nn.rnn_cell.GRUCell(RNN_CELLSIZE)`. See how it performs.\n",
    "<li> Make the RNN cell 2-deep [here](#assignment4) using `tf.nn.rnn_cell.MultiRNNCell`. See if this improves things. Try also training for 10 epochs instead of 5.\n",
    "<li> You can now go and check out the solutions in file [00_RNN_predictions_solution.ipynb](00_RNN_predictions_solution.ipynb). Its final cell has a loop that benchmarks all the neural network architectures. Try it and then if you have the time, try reducing the data sequence length from 16 to 8 (SEQLEN=8) and see if you can still predict the next value with so little context.\n",
    "</ol>\n",
    "</div>"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import numpy as np\n",
    "import utils_datagen\n",
    "import utils_display\n",
    "from matplotlib import pyplot as plt\n",
    "import tensorflow as tf\n",
    "print(\"Tensorflow version: \" + tf.__version__)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Generate fake dataset"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "DATA_SEQ_LEN = 1024*128\n",
    "data = np.concatenate([utils_datagen.create_time_series(waveform, DATA_SEQ_LEN) for waveform in utils_datagen.Waveforms])\n",
    "utils_display.picture_this_1(data, DATA_SEQ_LEN)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Hyperparameters"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "NB_EPOCHS = 5       # number of times the data is repeated during training\n",
    "RNN_CELLSIZE = 32   # size of the RNN cells\n",
    "SEQLEN = 16         # unrolled sequence length\n",
    "BATCHSIZE = 32      # mini-batch size"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Visualize training sequences\n",
    "This is what the neural network will see during training."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "utils_display.picture_this_2(data, BATCHSIZE, SEQLEN) # execute multiple times to see different sample sequences"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## The model definition\n",
    "When executed, these functions instantiate the Tensorflow graph for our model."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# three simplistic predictive models: can you beat them ?\n",
    "def simplistic_models(X):\n",
    "    # \"random\" model\n",
    "    Yrnd = tf.random_uniform([tf.shape(X)[0]], -2.0, 2.0) # tf.shape(X)[0] is the batch size\n",
    "    # \"same as last\" model\n",
    "    Ysal = X[:,-1]\n",
    "    # \"trend from last two\" model\n",
    "    Ytfl = X[:,-1] + (X[:,-1] - X[:,-2])\n",
    "    return Yrnd, Ysal, Ytfl"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# linear model (RMSE: 0.36, with shuffling: 0.17)\n",
    "def linear_model(X):\n",
    "    Yout = tf.layers.dense(X, 1) # output shape [BATCHSIZE, 1]\n",
    "    return Yout"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<a name=\"assignment1\"></a>\n",
    "<div class=\"alert alert-block alert-info\">\n",
    "**Assignment #1**: Implement the DNN (Dense Neural Network) model using a single `tf.layers.dense` layer. Do not forget to use the DNN_model function when [instantiating the model](#instantiate)\n",
    "</div>"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# 2-layer dense model (RMSE: 0.15-0.18, if training data is not shuffled: 0.38)\n",
    "def DNN_model(X):\n",
    "    # X shape [BATCHSIZE, SEQLEN]\n",
    "    \n",
    "    # --- dummy model: please implement a real one ---\n",
    "    # to test it, do not forget to use this function (DNN_model) when instantiating the model\n",
    "    Y = X * tf.Variable(tf.ones([]), name=\"dummy1\") # Y shape [BATCHSIZE, SEQLEN]\n",
    "    # --- end of dummy model ---\n",
    "    \n",
    "    Yout = tf.layers.dense(Y, 1, activation=None) # output shape [BATCHSIZE, 1]. Predicting vectors of 1 element.\n",
    "    return Yout"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<a name=\"assignment2\"></a>\n",
    "<div class=\"alert alert-block alert-info\">\n",
    "**Assignment #2**: Swap in the CNN (Convolutional Neural Network) model. It is already implemented in function CNN_model below so all you have to do is read through the CNN_model code and then use the CNN_model function when [instantiating the model](#instantiate).\n",
    "</div>"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# convolutional (RMSE: 0.31, with shuffling: 0.16)\n",
    "def CNN_model(X):\n",
    "    X = tf.expand_dims(X, axis=2) # [BATCHSIZE, SEQLEN, 1] is necessary for conv model\n",
    "    Y = tf.layers.conv1d(X, filters=8, kernel_size=4, activation=tf.nn.relu, padding=\"same\") # [BATCHSIZE, SEQLEN, 8]\n",
    "    Y = tf.layers.conv1d(Y, filters=16, kernel_size=3, activation=tf.nn.relu, padding=\"same\") # [BATCHSIZE, SEQLEN, 8]\n",
    "    Y = tf.layers.conv1d(Y, filters=8, kernel_size=1, activation=tf.nn.relu, padding=\"same\") # [BATCHSIZE, SEQLEN, 8]\n",
    "    Y = tf.layers.max_pooling1d(Y, pool_size=2, strides=2)  # [BATCHSIZE, SEQLEN//2, 8]\n",
    "    Y = tf.layers.conv1d(Y, filters=8, kernel_size=3, activation=tf.nn.relu, padding=\"same\")  # [BATCHSIZE, SEQLEN//2, 8]\n",
    "    Y = tf.layers.max_pooling1d(Y, pool_size=2, strides=2)  # [BATCHSIZE, SEQLEN//4, 8]\n",
    "    # mis-using a conv layer as linear regression :-)\n",
    "    Yout = tf.layers.conv1d(Y, filters=1, kernel_size=SEQLEN//4, activation=None, padding=\"valid\") # output shape [BATCHSIZE, 1, 1]\n",
    "    Yout = tf.squeeze(Yout, axis=-1) # output shape [BATCHSIZE, 1]\n",
    "    return Yout"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<a name=\"assignment3\"></a>\n",
    "<div class=\"alert alert-block alert-info\">\n",
    "**Assignment #3**: Implement the RNN (Recurrent Neural Network) model using `tf.nn.rnn_cell.GRUCell` and `tf.nn.dynamic_rnn`. Do not forget to use the RNN_model_N function when [instantiating the model](#instantiate).</div>\n",
    "\n",
    "<a name=\"assignment4\"></a>\n",
    "<div class=\"alert alert-block alert-info\">\n",
    "**Assignment #4**: Make the RNN cell 2-deep [here](#assignment2) using `tf.nn.rnn_cell.MultiRNNCell`. See if this improves things. Try also training for 10 epochs instead of 5. Finally try to compute the loss on the last n elemets of the predicted sequence instead of the last (n=SEQLEN//2 for example). Do not forget to use the RNN_model_N function when [instantiating the model](#instantiate).\n",
    "</div>\n",
    "\n",
    "![deep RNN schematic](../../../docs/images/RNN1.svg)\n",
    "<div style=\"text-align: right; font-family: monospace\">\n",
    "  X shape [BATCHSIZE, SEQLEN, 1]<br/>\n",
    "  Y shape [BATCHSIZE, SEQLEN, 1]<br/>\n",
    "  H shape [BATCHSIZE, RNN_CELLSIZE*NLAYERS]\n",
    "</div>"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# RNN model (RMSE: 0.38, with shuffling 0.14, the same with loss on last 8)\n",
    "def RNN_model(X, n=1):\n",
    "    X = tf.expand_dims(X, axis=2) # shape [BATCHSIZE, SEQLEN, 1] is necessary for RNN model\n",
    "    batchsize = tf.shape(X)[0] # allow for variable batch size\n",
    "    \n",
    "    # --- dummy model: please implement a real RNN model ---\n",
    "    # to test it, do not forget to use this function (RNN_model) when instantiating the model\n",
    "    Yn = X * tf.ones([RNN_CELLSIZE], name=\"dummy2\") # Yn shape [BATCHSIZE, SEQLEN, RNN_CELLSIZE]\n",
    "    # TODO: create a tf.nn.rnn_cell.GRUCell\n",
    "    # TODO: unroll the cell using tf.nn.dynamic_rnn(..., dtype=tf.float32)\n",
    "    # --- end of dummy model ---\n",
    "    \n",
    "    # This is the regression layer. It is already implemented.\n",
    "    # Yn [BATCHSIZE, SEQLEN, RNN_CELLSIZE]\n",
    "    Yn = tf.reshape(Yn, [batchsize*SEQLEN, RNN_CELLSIZE])\n",
    "    Yr = tf.layers.dense(Yn, 1) # Yr [BATCHSIZE*SEQLEN, 1] predicting vectors of 1 element\n",
    "    Yr = tf.reshape(Yr, [batchsize, SEQLEN, 1]) # Yr [BATCHSIZE, SEQLEN, 1]\n",
    "    \n",
    "    # In this RNN model, you can compute the loss on the last predicted item or the last n predicted items\n",
    "    # Last n with n=SEQLEN//2 is slightly better. This is a hyperparameter you can adjust in the RNN_model_N\n",
    "    # function below.\n",
    "    Yout = Yr[:,-n:SEQLEN,:] # last item(s) in sequence: output shape [BATCHSIZE, n, 1]\n",
    "    Yout = tf.squeeze(Yout, axis=-1) # remove the last dimension (1): output shape [BATCHSIZE, n]\n",
    "    return Yout"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "def RNN_model_N(X): return RNN_model(X, n=SEQLEN//2)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "def model_fn(features, labels, model):\n",
    "    X = features # shape [BATCHSIZE, SEQLEN]\n",
    "    \n",
    "    Y = model(X)\n",
    "\n",
    "    last_label = labels[:, -1] # last item in sequence: the target value to predict\n",
    "    last_labels = labels[:, -tf.shape(Y)[1]:SEQLEN] # last p items in sequence (as many as in Y), useful for RNN_model(X, n>1)\n",
    "\n",
    "    loss = tf.losses.mean_squared_error(Y, last_labels) # loss computed on last label(s)\n",
    "    optimizer = tf.train.AdamOptimizer(learning_rate=0.01)\n",
    "    train_op = optimizer.minimize(loss)\n",
    "    Yrnd, Ysal, Ytfl = simplistic_models(X)\n",
    "    eval_metrics = {\"RMSE\": tf.sqrt(loss),\n",
    "                    # compare agains three simplistic predictive models: can you beat them ?\n",
    "                    \"RMSErnd\": tf.sqrt(tf.losses.mean_squared_error(Yrnd, last_label)),\n",
    "                    \"RMSEsal\": tf.sqrt(tf.losses.mean_squared_error(Ysal, last_label)),\n",
    "                    \"RMSEtfl\": tf.sqrt(tf.losses.mean_squared_error(Ytfl, last_label))}\n",
    "    \n",
    "    Yout = Y[:,-1]\n",
    "    return Yout, loss, eval_metrics, train_op"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# prepare training dataset"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# training to predict the same sequence shifted by one (next value)\n",
    "labeldata = np.roll(data, -1)\n",
    "# slice data into sequences\n",
    "traindata = np.reshape(data, [-1, SEQLEN])\n",
    "labeldata = np.reshape(labeldata, [-1, SEQLEN])\n",
    "\n",
    "# also make an evaluation dataset by randomly subsampling our fake data\n",
    "EVAL_SEQUENCES = DATA_SEQ_LEN*4//SEQLEN//4\n",
    "joined_data = np.stack([traindata, labeldata], axis=1) # new shape is [N_sequences, 2(train/eval), SEQLEN]\n",
    "joined_evaldata = joined_data[np.random.choice(joined_data.shape[0], EVAL_SEQUENCES, replace=False)]\n",
    "evaldata = joined_evaldata[:,0,:]\n",
    "evallabels = joined_evaldata[:,1,:]\n",
    "\n",
    "def datasets(nb_epochs):\n",
    "    # Dataset API for batching, shuffling, repeating\n",
    "    dataset = tf.data.Dataset.from_tensor_slices((traindata, labeldata))\n",
    "    dataset = dataset.repeat(NB_EPOCHS)\n",
    "    dataset = dataset.shuffle(DATA_SEQ_LEN*4//SEQLEN) # important ! Number of sequences in shuffle buffer: all of them\n",
    "    dataset = dataset.batch(BATCHSIZE)\n",
    "    \n",
    "    # Dataset API for batching\n",
    "    evaldataset = tf.data.Dataset.from_tensor_slices((evaldata, evallabels))\n",
    "    evaldataset = evaldataset.repeat()\n",
    "    evaldataset = evaldataset.batch(EVAL_SEQUENCES) # just one batch with everything\n",
    "\n",
    "    # Some boilerplate code...\n",
    "    \n",
    "    # this creates a Tensorflow iterator of the correct type and shape\n",
    "    # compatible with both our training and eval datasets\n",
    "    tf_iter = tf.data.Iterator.from_structure(dataset.output_types, dataset.output_shapes)\n",
    "    # it can be initialized to iterate through the training dataset\n",
    "    dataset_init_op = tf_iter.make_initializer(dataset)\n",
    "    # or it can be initialized to iterate through the eval dataset\n",
    "    evaldataset_init_op = tf_iter.make_initializer(evaldataset)\n",
    "    # Returns the tensorflow nodes needed by our model_fn.\n",
    "    features, labels = tf_iter.get_next()\n",
    "    # When these nodes will be executed (sess.run) in the training or eval loop,\n",
    "    # they will output the next batch of data.\n",
    "\n",
    "    # Note: when you do not need to swap the dataset (like here between train/eval) just use\n",
    "    # features, labels = dataset.make_one_shot_iterator().get_next()\n",
    "    # TODO: easier with tf.estimator.inputs.numpy_input_fn ???\n",
    "    \n",
    "    return features, labels, dataset_init_op, evaldataset_init_op"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<a name=\"instantiate\"></a>\n",
    "## Instantiate the model"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "tf.reset_default_graph() # restart model graph from scratch\n",
    "# instantiate the dataset\n",
    "features, labels, dataset_init_op, evaldataset_init_op = datasets(NB_EPOCHS)\n",
    "# instantiate the model\n",
    "Yout, loss, eval_metrics, train_op = model_fn(features, labels, linear_model)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Initialize Tensorflow session\n",
    "This resets all neuron weights and biases to initial random values"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# variable initialization\n",
    "sess = tf.Session()\n",
    "init = tf.global_variables_initializer()\n",
    "sess.run(init)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## The training loop\n",
    "You can re-execute this cell to continue training"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "count = 0\n",
    "losses = []\n",
    "indices = []\n",
    "sess.run(dataset_init_op)\n",
    "while True:\n",
    "    try: loss_, _ = sess.run([loss, train_op])\n",
    "    except tf.errors.OutOfRangeError: break\n",
    "    # print progress\n",
    "    if count%300 == 0:\n",
    "        epoch = count // (DATA_SEQ_LEN*4//BATCHSIZE//SEQLEN)\n",
    "        print(\"epoch \" + str(epoch) + \", batch \" + str(count) + \", loss=\" + str(loss_))\n",
    "    if count%10 == 0:\n",
    "        losses.append(np.mean(loss_))\n",
    "        indices.append(count)\n",
    "    count += 1\n",
    "    \n",
    "# final evaluation\n",
    "sess.run(evaldataset_init_op)\n",
    "eval_metrics_, Yout_ = sess.run([eval_metrics, Yout])\n",
    "print(\"Final accuracy on eval dataset:\")\n",
    "print(str(eval_metrics_))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "plt.ylim(ymax=np.amax(losses[1:])) # ignore first value(s) for scaling\n",
    "plt.plot(indices, losses)\n",
    "plt.show()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# execute multiple times to see different sample sequences\n",
    "utils_display.picture_this_3(Yout_, evaldata, evallabels, SEQLEN)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Copyright 2018 Google LLC\n",
    "\n",
    "Licensed under the Apache License, Version 2.0 (the \"License\");\n",
    "you may not use this file except in compliance with the License.\n",
    "You may obtain a copy of the License at\n",
    "[http://www.apache.org/licenses/LICENSE-2.0](http://www.apache.org/licenses/LICENSE-2.0)\n",
    "Unless required by applicable law or agreed to in writing, software\n",
    "distributed under the License is distributed on an \"AS IS\" BASIS,\n",
    "WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n",
    "See the License for the specific language governing permissions and\n",
    "limitations under the License."
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.6.7"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}
